Efficient and Anonymous Web-Usage Mining for Web Personalization

Authors

  • Cyrus Shahabi
  • Farnoush Banaei Kashani
Abstract

The World Wide Web (WWW) is the largest distributed information space and has grown to encompass diverse information resources. Although the web is growing exponentially, the individual’s capacity to read and digest content is essentially fixed. The full economic potential of the web will not be realized unless enabling technologies are provided to facilitate access to web resources. Currently, web personalization is the most promising approach to remedy this problem, and web mining, particularly web-usage mining, is considered a crucial component of any efficacious web-personalization system. In this paper, we describe a complete framework for web-usage mining to satisfy the challenging requirements of web-personalization applications. For online and anonymous web personalization to be effective, web-usage mining must be accomplished in real time and as accurately as possible. On the other hand, web-usage mining should allow a compromise between scalability and accuracy to be applicable to real-life websites with numerous visitors. Within our web-usage-mining framework, we introduce a distributed user-tracking approach for accurate, scalable, and implicit collection of the usage data. We also propose a new model, the feature-matrices (FM) model, to discover and interpret users’ access patterns. With FM, various spatial and temporal features of usage data can be captured with flexible precision so that we can trade off accuracy for scalability based on the specific application requirements. Moreover, the tunable complexity of the FM model allows real-time and adaptive access-pattern discovery from usage data. We define a novel similarity measure based on FM that is specifically designed for accurate classification of partial navigation patterns in real time. Our extensive experiments with both synthetic and real data verify the correctness and efficacy of our web-usage-mining framework for anonymous and efficient web personalization.
(Web-Usage Mining; Data Mining; Personalization; Pattern Discovery)
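
The abstract does not give the definition of the feature matrices or of the FM similarity measure, so the following is only a rough illustrative sketch of the general idea, not the authors' method. It assumes one possible feature (counts of page-to-page transitions, stored sparsely), aggregates it over the sessions of a cluster, and compares a partial session to each cluster with plain cosine similarity, which is swapped in here for the paper's own measure. All function names, page names, and cluster names are hypothetical.

```python
from collections import Counter
from math import sqrt


def transition_counts(session):
    """Count ordered page-to-page transitions in one navigation session.

    This sparse count table stands in for a single feature matrix;
    it is an assumption, not the FM model defined in the paper.
    """
    return Counter(zip(session, session[1:]))


def aggregate(sessions):
    """Sum the transition counts of all sessions in a cluster."""
    total = Counter()
    for session in sessions:
        total.update(transition_counts(session))
    return total


def cosine_similarity(a, b):
    """Cosine similarity between two sparse count tables (0.0 if either is empty)."""
    dot = sum(count * b[key] for key, count in a.items())
    norm_a = sqrt(sum(count * count for count in a.values()))
    norm_b = sqrt(sum(count * count for count in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def classify(partial_session, clusters):
    """Return the name of the cluster most similar to a partial session."""
    features = transition_counts(partial_session)
    return max(clusters, key=lambda name: cosine_similarity(features, clusters[name]))


# Hypothetical clusters built from made-up sessions.
clusters = {
    "news": aggregate([["home", "news", "sports"], ["home", "news", "weather"]]),
    "shop": aggregate([["home", "shop", "cart"]]),
}
print(classify(["home", "news", "sports"], clusters))
```

Keeping the counts in a sparse table means the cost of comparing a partial session to a cluster grows with the number of transitions the visitor has made so far, not with the size of the site. This is one simple way to picture the trade-off between accuracy and scalability that the abstract mentions.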

Similar resources

A Framework for Efficient and Anonymous Web Usage Mining Based on Client-Side Tracking

Web Usage Mining (WUM), a natural application of data mining techniques to the data collected from user interactions with the web, has greatly concerned both academia and industry in recent years. Through WUM, we are able to gain a better understanding of both the web and web user access patterns; a knowledge that is crucial for realization of full economic potential of the web. In this chapter...

WebPersonalizer: A Server-Side Recommender System Based on Web Usage Mining

Existing approaches to Web personalization often rely heavily on explicit and subjective user input, resulting in static profiles which are prone to biases. In this paper we present a usage-based Web personalization system, called WebPersonalizer, drawing heavily upon Web mining techniques, making the personalization process automatic and dynamic. The system architecture separates the offline ta...

Comprehensive Survey of Framework for Web Personalization using Web Mining

World Wide Web is a global village and a rich source of information. The number of users accessing web sites is increasing day by day. For effective and efficient handling, web mining coupled with recommendation techniques provides personalized contents at the disposal of users. Web Mining is an area of Data Mining dealing with the extraction of interesting knowledge from the World Wide Web. Wh...

Survey on Particle Swarm Optimization Based Web Mining

Web Mining is a challenging task that searches for Web access patterns, Web structures and the regularity and dynamics of the Web contents. It provides efficient Web Personalization, System Improvement, Site Modification, Business Intelligence and Usage Characterization. High-dimensional Web Log File clustering is a challenging task and requires an efficient clustering technique. The efficiency...

A Least Square Approach to Analyze Usage Data for Effective Web Personalization

Web server logs have abundant information about the nature of the users accessing them. Web usage mining, in conjunction with standard approaches to personalization, helps to address some of the shortcomings of these techniques, including reliance on subjective user ratings, lack of scalability, poor performance, and sparse data. But, it is not sufficient to discover patterns from usage data for perfor...

Journal title:
  • INFORMS Journal on Computing

Volume 15, Issue

Pages -

Publication date: 2003